Transductive Zero-Shot Learning with Visual Structure Constraint

Neural Information Processing Systems

To recognize objects of unseen classes, most existing Zero-Shot Learning (ZSL) methods first learn a compatible projection function between the common semantic space and the visual space from the data of the source (seen) classes, then directly apply it to the target (unseen) classes. However, in real scenarios the data distributions of the source and target domains may not match well, causing the well-known domain shift problem. Based on the observation that the visual features of test instances can be separated into distinct clusters, we propose a new visual structure constraint on class centers for transductive ZSL, to improve the generality of the projection function (i.e., to alleviate the domain shift problem). Specifically, three different strategies (symmetric Chamfer distance, bipartite matching distance, and Wasserstein distance) are adopted to align the projected unseen semantic centers with the visual cluster centers of the test instances. We also propose a new training strategy to handle the realistic case where many unrelated images exist in the test dataset, which is not considered by previous methods. Experiments on several widely used datasets demonstrate that the proposed visual structure constraint consistently brings substantial performance gains and achieves state-of-the-art results.
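The set-alignment idea in the abstract can be sketched with two of the three strategies: symmetric Chamfer distance and bipartite matching distance between a set of projected semantic centers and a set of visual cluster centers. This is an illustrative NumPy/SciPy sketch, not the authors' implementation; the function names are assumptions. (For uniform discrete distributions with equally many points, the bipartite matching cost also coincides with the optimal-transport cost underlying the Wasserstein strategy.)

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def chamfer_alignment_loss(semantic_centers, visual_centers):
    """Symmetric Chamfer distance between two sets of d-dim centers.

    Each center in one set is matched to its nearest neighbor in the
    other set; the squared distances are averaged in both directions.
    """
    # Pairwise squared Euclidean distances, shape (n_semantic, n_visual).
    d = np.linalg.norm(
        semantic_centers[:, None, :] - visual_centers[None, :, :], axis=-1
    ) ** 2
    # Nearest visual center for each semantic center, and vice versa.
    return d.min(axis=1).mean() + d.min(axis=0).mean()


def bipartite_alignment_loss(semantic_centers, visual_centers):
    """One-to-one matching cost via the Hungarian algorithm."""
    d = np.linalg.norm(
        semantic_centers[:, None, :] - visual_centers[None, :, :], axis=-1
    ) ** 2
    rows, cols = linear_sum_assignment(d)  # optimal one-to-one assignment
    return d[rows, cols].mean()
```

In the paper these distances serve as training losses that pull the projected semantic centers toward the visual cluster centers; here they are shown only as set-to-set distance computations between fixed center matrices. Chamfer allows many-to-one matches, while bipartite matching enforces a one-to-one correspondence.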



Reviews: Transductive Zero-Shot Learning with Visual Structure Constraint

Neural Information Processing Systems

Strength:
- The paper proposes an interesting and novel approach for transductive zero-shot learning.
- It would be great to also include zero-shot performance on ImageNet (this is most likely missing because there are no attribute annotations for ImageNet, but the approach does not seem to be limited to attribute-based transfer).
- It would be interesting to quantitatively compare against [31] and [34], from which the authors took inspiration, as ablations of the authors' approach.
- The authors claim in the reproducibility checklist to have "Clearly defined error bars" and "A description of results with central tendency (e.g.
- The paper fails to discuss (qualitatively and quantitatively) recent related work, including [A].


Reviews: Transductive Zero-Shot Learning with Visual Structure Constraint

Neural Information Processing Systems

The submission originally received mixed scores that put it into the borderline region. The reviewers praised the simple and apparently effective method, but also noted a number of issues, in particular an unclear relation to [34] (which is itself rather unclear) as well as an insufficient experimental evaluation. In their response the authors provided additional information and results, which the reviewers appreciated. A detailed discussion followed, which ultimately led to the conclusion that the contribution is valuable and that the authors should not be punished for a lack of clarity in the prior work [34]. Therefore, the recommendation is to accept the work.



Transductive Zero-Shot Learning with Visual Structure Constraint

Wan, Ziyu, Chen, Dongdong, Li, Yan, Yan, Xingguang, Zhang, Junge, Yu, Yizhou, Liao, Jing

Neural Information Processing Systems
